A Practical Algorithm for Intersecting Weighted Context-free Grammars with Finite-State Automata

Author

  • Thomas Hanneforth
Abstract

It is well known that context-free parsing can be seen as the intersection of a context-free language with a regular language (or, equivalently, the intersection of a context-free grammar with a finite-state automaton). The present article provides a practical and efficient way to compute this intersection by converting the grammar into a special finite-state automaton (the GLR(0)-automaton), which is subsequently intersected with the given finite-state automaton. As a byproduct, we present a generalisation of Tomita’s algorithm that recognizes several inputs simultaneously.
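To make the idea of intersecting a grammar with an automaton concrete, the following is a minimal sketch of the classical Bar-Hillel-style construction. It is not the GLR(0)-based method the abstract describes. It assumes an unweighted grammar in Chomsky normal form and an automaton without epsilon arcs, and all function and variable names are hypothetical. It computes the triples (p, A, q) for which nonterminal A derives a string that takes the automaton from state p to state q.

```python
from itertools import product


def intersect_items(unary, binary, transitions):
    """Return all triples (p, A, q) such that nonterminal A derives some
    string that moves the automaton from state p to state q.

    unary:       set of (A, a) for rules A -> a
    binary:      set of (A, B, C) for rules A -> B C
    transitions: set of (p, a, q) automaton arcs (assumed: no epsilon arcs)
    """
    # Base case: a terminal rule A -> a matches every arc labelled a.
    items = {(p, A, q)
             for (A, a) in unary
             for (p, b, q) in transitions
             if a == b}
    # Close under binary rules until no new triple appears (fixed point).
    changed = True
    while changed:
        changed = False
        for (A, B, C) in binary:
            left = [(p, r) for (p, X, r) in items if X == B]
            right = [(r, q) for (r, X, q) in items if X == C]
            for (p, r), (r2, q) in product(left, right):
                if r == r2 and (p, A, q) not in items:
                    items.add((p, A, q))
                    changed = True
    return items


def accepted_finals(start_symbol, unary, binary, transitions, initial, finals):
    """Final states reachable from `initial` by a string of the grammar."""
    items = intersect_items(unary, binary, transitions)
    return {f for f in finals if (initial, start_symbol, f) in items}


# Illustrative grammar for a^n b^n (n >= 1) in Chomsky normal form:
#   S -> A B | A X,  X -> S B,  A -> a,  B -> b
UNARY = {("A", "a"), ("B", "b")}
BINARY = {("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")}

# One automaton encoding two inputs at once: "ab" (ends in state 2)
# and "aab" (ends in state 4), sharing the first arc.
TWO_INPUTS = {(0, "a", 1), (1, "b", 2), (1, "a", 3), (3, "b", 4)}

result = accepted_finals("S", UNARY, BINARY, TWO_INPUTS, 0, {2, 4})
```

Because the automaton here packs two input strings into one structure, a single run decides both of them, which is the unweighted analogue of recognizing several inputs simultaneously. A weighted variant would attach a semiring weight to each triple instead of storing bare set membership.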

Similar resources

Intersecting Hierarchical and Phrase-Based Models of Translation: Formal Aspects and Algorithms

We address the problem of constructing hybrid translation systems by intersecting a Hiero-style hierarchical system with a phrase-based system and present formal techniques for doing so. We model the phrase-based component by introducing a variant of weighted finite-state automata, called σ-automata, provide a self-contained description of a general algorithm for intersecting weighted synchrono...

Context-Free Recognition with Weighted Automata

We introduce the definition of language recognition with weighted automata, a generalization of the classical definition of recognition with unweighted acceptors. We show that, with our definition of recognition, weighted automata can be used to recognize a class of languages that strictly includes regular languages. The class of languages accepted depends on the weight set which has the algebra...

Chapter 9: Regular Approximation of Context-Free Grammars through Transformation

We present an algorithm for approximating context-free languages with regular languages. The algorithm is based on a simple transformation that applies to any context-free grammar and guarantees that the result can be compiled into a finite automaton. The resulting grammar contains at most one new nonterminal for any nonterminal symbol of the input grammar. The result thus remains readable and ...

Finding the Most Probable String and the Consensus String: an Algorithmic Study

The problem of finding the most probable string for a distribution generated by a weighted finite automaton or a probabilistic grammar is related to a number of important questions: computing the distance between two distributions or finding the best translation (the most probable one) given a probabilistic finite state transducer. The problem is undecidable with general weights and is NP-hard ...

Parsing with Pictures

The development of elegant and practical algorithms for parsing context-free languages is one of the major accomplishments of 20th century Computer Science. These algorithms are presented in the literature using string rewriting systems or abstract machines like pushdown automata, but the resulting descriptions are unsatisfactory for several reasons. First, even a basic understanding of parsing a...

Publication date: 2011